*************** READ ME ******************


These materials supplement the work described in:

Freshwater habitats promote rapid rates of phenotypic evolution in sculpin fishes (Perciformes: Cottoidea)

by

Thaddaeus J. Buser*, Olivier Larouche, Andres Aguilar, Mayara P. Neves, Michael Sandel, Brian L. Sidlauskas, Adam P. Summers, and Kory M. Evans

*Corresponding author: Thaddaeus.Buser@gmail.com


in

The American Naturalist


ABSTRACT
The invasion of freshwater habitats by marine fishes is an exceptional case of habitat-driven biological diversification. Freshwater habitats make up less than 1% of aquatic habitats but contain ~50% of fish species. However, while the dominant group of freshwater fishes (i.e., Otophysi) is older than that of marine fishes (i.e., Percomorphaceae), it is less morphologically diverse. Unequal morphological diversification between two clades is classically explained by two phenomena: differences in the tempo and/or differences in mode of evolution between the two groups. We tested for evidence of these two phenomena in the superfamily Cottoidea (sculpins), which contains substantial radiations of both marine and freshwater fishes. We find that the morphology of freshwater sculpins evolves faster but under higher constraint than that of marine sculpins, causing widespread convergence in freshwater sculpins and higher incidence of morphological novelty in marine sculpins. The endemic freshwater sculpins of Lake Baikal, Siberia, are exceptions, and there we observed high levels of novelty more akin to that of the marine environment. There are several tantalizing explanations for these findings, such as differences in habitat stability and/or habitat connectivity between marine and freshwater systems. 



Contents and description:

Supplementary Figures and captions contains all supplementary figures and respective captions.

Supplementary_Rfolder contains an annotated R script and all necessary raw data to perform all analyses conducted in R.
	The session info for all analyses is as follows:
		R version 4.2.3 (2023-03-15 ucrt)
		Platform: x86_64-w64-mingw32/x64 (64-bit)
		Running under: Windows 10 x64 (build 19045)
		attached base packages:
		  [1] stats     graphics  grDevices utils     datasets  methods   base     
		other attached packages:
		   [1] abind_1.4-5         cluster_2.1.4       randomcoloR_1.1.0.1 recexcavAAR_0.3.0   kriging_1.2        
		   [6] stringi_1.7.12      Rvcg_0.22.1         fishtree_0.3.4      geiger_2.0.10       phytools_1.2-0     
		   [11] maps_3.4.1          ape_5.6-2           Morpho_2.11         geomorph_4.0.5      Matrix_1.5-3       
		   [16] rgl_1.0.1           RRPP_1.3.1  
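
	Before running the script, it may help to confirm that the locally installed packages match the versions listed above. The following is a minimal sketch only and is not part of the supplied R script; the helper name check_versions is hypothetical, and the version strings are copied from the session info above.

```r
# Package versions copied from the session info listed above.
expected <- c(
  abind = "1.4-5", cluster = "2.1.4", randomcoloR = "1.1.0.1",
  recexcavAAR = "0.3.0", kriging = "1.2", stringi = "1.7.12",
  Rvcg = "0.22.1", fishtree = "0.3.4", geiger = "2.0.10",
  phytools = "1.2-0", maps = "3.4.1", ape = "5.6-2",
  Morpho = "2.11", geomorph = "4.0.5", Matrix = "1.5-3",
  rgl = "1.0.1", RRPP = "1.3.1"
)

# Hypothetical helper (not from the authors' script): compares each
# installed package version with the expected one without attaching
# the package. Packages that are not installed are reported as NA.
check_versions <- function(expected) {
  rows <- lapply(names(expected), function(pkg) {
    found <- tryCatch(utils::packageVersion(pkg), error = function(e) NULL)
    data.frame(
      package = pkg,
      expected = expected[[pkg]],
      installed = if (is.null(found)) NA_character_ else as.character(found),
      match = !is.null(found) && found == package_version(expected[[pkg]]),
      stringsAsFactors = FALSE
    )
  })
  do.call(rbind, rows)
}

report <- check_versions(expected)
# Show only the packages that are missing or differ from the listed version.
print(report[!report$match, ])
```

	A mismatch does not necessarily mean the analyses will fail; it only flags where results could differ from those obtained under the listed session.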


SupplementarySlicerScene contains a 3D Slicer scene and all constituent files needed to duplicate the views of landmarks shown in the Supplementary Figures and captions file. Optimal viewing requires the program 3D Slicer with the SlicerMorph extension installed. 3D Slicer is free and cross-platform (https://download.slicer.org/).


SupplementaryTable1 shows the taxonomy, museum identifier, scanning parameters, and MorphoSource identifiers for all specimens. Columns are as follows: Family: taxonomic family. Species: scientific name of species. Genus: taxonomic genus. ScanID: unique identifier of the CT scan session from which a given specimen was taken. number: museum voucher number of a specimen. Collection: abbreviation of the museum at which the specimen voucher is curated. Standard Length (mm): the standard length in mm of a specimen. kV: the voltage setting in kilovolts of the CT scan. uA: the amperage setting in microamperes of the CT scan. Filter: the x-ray filter material used for the CT scan. Detector: the number of pixels used in the x-ray detector of the CT scan. rotation: the rotation in degrees of the specimen at each rotation step of the CT scan. Voxel size: the size in micrometers represented by each voxel in the reconstructed CT scan. MorphoSourceID: the unique MorphoSource identifier of each specimen deposited on MorphoSource.org.


SupplementaryTable2 contains anatomical descriptions for all landmarks and semilandmarks used in the study. Rows are partitioned into three sections: LANDMARKS, CURVE DESCRIPTIONS, and SEMILANDMARKS. LANDMARKS contains the data for all anatomical landmarks. Columns in this section are as follows: Bone: the skeletal unit (e.g., bone) on which the landmarks are located. Definition: the anatomical description of the landmark. LHS Landmark #: the sequential number assigned to the landmark as it appears on the Left Hand Side (LHS) of the specimen. RHS Landmark #: the sequential number assigned to the landmark as it appears on the Right Hand Side (RHS) of the specimen. CURVE DESCRIPTIONS contains anatomical descriptions of each of the anatomical curves. Columns in this section are as follows: Curve #: the sequential number of each curve. Bone: the skeletal unit on which the curve is located. Definition: anatomical description of the curve. Beginning landmark: the number of the landmark that describes the beginning (one terminal point) of the curve. The LHS landmark is first, followed by the homologous landmark on the RHS in parentheses. Ending Landmark: the number of the landmark that describes the end (other terminal point) of the curve. The LHS landmark is first, followed by the homologous landmark on the RHS in parentheses. # of semilandmarks: the number of semilandmarks making up the curve between the landmarks that define the end points. SEMILANDMARKS contains the information for each semilandmark. Columns in this section are as follows: Bone: skeletal unit on which the semilandmark occurs. Curve #: the curve number, which corresponds to the Curve # column in the CURVE DESCRIPTIONS section. LHS Landmark #: the sequential number of each semilandmark on the LHS of the specimen. RHS Landmark #: the sequential number of each semilandmark on the RHS of the specimen.


SupplementaryTable3 shows model comparisons for the analysis of skull shape evolution in sculpins. Columns are as follows: Model: the model of morphological evolution. BM = Brownian Motion, OU = Ornstein-Uhlenbeck. See main text for additional details. Log Marginal Likelihood: the log marginal likelihood of that model, given our data. Log Bayes Factor: the log Bayes factor of that model, given our data.


SupplementaryTable4 shows phylogenetic net morphological evolutionary rate, product-based lineage density, and sum-based lineage density for each grouping factor in the dataset. We used two alternate grouping factors: 1) Taxonomic Family and 2) Ecosystem. The rows of the table are partitioned such that the values calculated when grouping by taxonomic family are presented first, followed by the rows containing the values calculated when grouping by Ecosystem. All calculations were then performed twice: once including the Baikal sculpins in the family Cottidae and once excluding the Baikal sculpins. When grouped by taxonomic family, Cottidae had the highest net rate of morphological evolution and the highest lineage density; when grouped by ecosystem, freshwater sculpins had the highest net rate of morphological evolution and the highest lineage density (indicated in bold text), regardless of whether the Baikal taxa were included. Columns are partitioned by BAIKAL PRESENT, which indicates that the Baikal taxa were included in the analysis, and BAIKAL ABSENT, which indicates that the Baikal taxa were excluded from the analysis. Taxonomic Family column indicates the taxonomic family of the test group. Phylogenetic net evolutionary rate is the net rate of morphological evolution for the group given the phylogenetic hypothesis. Lineage Density (product-based) is the lineage density of a group derived from the first lineage density equation presented in Sidlauskas (2008), which is product-based. Lineage Density (sum-based) is the lineage density of a group derived from the second lineage density equation presented in Sidlauskas (2008), which is sum-based. See main text for additional details.


SupplementaryTable5 is a summary of Analysis of Variance for evolutionary allometry analysis using log maximum body size for each species, using Residual Randomization Permutation procedure: Randomization of null model residuals. Number of permutations: 1000. Estimation method: Generalized Least-Squares (via OLS projection). Sums of Squares and Cross-products: Type II. Effect sizes (Z) based on F distributions. Call: procD.lm(f1 = coords ~ log(size) * family, iter = iter, seed = seed, RRPP = TRUE, SS.type = SS.type, effect.type = effect.type, int.first = int.first, Cov = Cov, data = data, print.progress = print.progress). Columns are as follows: Df: degrees of freedom. SS: sum of squared error. MS: mean squared error. Rsq: R-squared value. F: F-statistic. Z: Z-statistic (effect size). Pr(>F): p-value.


SupplementaryTable6 is a summary of Analysis of Variance for evolutionary allometry analysis using log average centroid size for the specimens representing a species, using Residual Randomization Permutation procedure: Randomization of null model residuals. Number of permutations: 1000. Estimation method: Generalized Least-Squares (via OLS projection). Sums of Squares and Cross-products: Type II. Effect sizes (Z) based on F distributions. Call: procD.lm(f1 = coords ~ log(size) * family, iter = iter, seed = seed, RRPP = TRUE, SS.type = SS.type, effect.type = effect.type, int.first = int.first, Cov = Cov, data = data, print.progress = print.progress). Columns are as follows: Df: degrees of freedom. SS: sum of squared error. MS: mean squared error. Rsq: R-squared value. F: F-statistic. Z: Z-statistic (effect size). Pr(>F): p-value.








